On this notebook we will download, prepare and process geospatial data (population and points of interest) from the capital cities/areas from three caribean countries Barbados 🇧🇧, Jamaica 🇯🇲, and Trindad and Tobago 🇹🇹.
All these steps will be done using mainly UrbanPy, a Python library developed by IDB to do urban data science fast and flexible.
import urbanpy as up
import pandas as pd
import contextily as cx
import plotly
import plotly.express as px
import matplotlib.pyplot as plt
import warnings
from matplotlib.lines import Line2D
from tqdm.auto import tqdm
warnings.filterwarnings('ignore')
tqdm.pandas() # Activate progress bar
Bridgetown is the capital of Barbados, and it is located in the Saint Michael region. We will download its administrative boundaries using nominatim_osm function from the download module.
saint_michael = up.download.nominatim_osm('Saint Michael, Barbados')
saint_michael.plot()
<AxesSubplot:>
Similarly we will download administrative boundaries for Kingston 🇯🇲 and Port of Spain 🇹🇹
kingston = up.download.nominatim_osm('Kingston and Saint Andrew Corporation')
port_of_spain = up.download.nominatim_osm('Port of Spain, Trinidad and Tobago', 1)
fig, axes = plt.subplots(1, 3, figsize=(10, 5))
fig.suptitle('Administrative Boundaries')
saint_michael.plot(ax=axes[0])
axes[0].set_title('Saint Michael')
kingston.plot(ax=axes[1])
axes[1].set_title('Kingston')
port_of_spain.plot(ax=axes[2])
axes[2].set_title('Port of Spain')
plt.tight_layout()
plt.show()
First we download population data for each country from the Humanitarian Data Exchange Portal.
*Source: Facebook Connectivity Lab and Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. High Resolution Settlement Layer (HRSL). Source imagery for HRSL © 2016 DigitalGlobe. Accessed 9 June 2022.*
barbados_pop = up.download.hdx_dataset("ee2abe39-e46b-4708-9204-7320433fc351/resource/072657ca-cc34-4198-aa37-2441f9a08ff6/download/brb_general_2020_csv.zip")
jamaica_pop = up.download.hdx_dataset("f45181ca-61b0-4132-bc01-52461944c3aa/resource/7c22d93a-cf35-47ac-bd7b-032a81276aa8/download/population_jam_2018-10-01.csv.zip")
trinidad_and_tobago_pop = up.download.hdx_dataset("33a00cd9-eb08-4236-bcad-207a52e20c83/resource/3110ad54-3a78-4776-9ff7-8a0e09c54a57/download/population_tto_2018-10-01.csv.zip")
Then we need to filter the data for each city
saint_michael_pop = up.geom.filter_population(barbados_pop, saint_michael)
kingston_pop = up.geom.filter_population(jamaica_pop, kingston)
port_of_spain_pop = up.geom.filter_population(trinidad_and_tobago_pop, port_of_spain)
Let's visualize the data we downloaded
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
plt.suptitle('Population at 30m x 30m resolution (points)')
saint_michael_pop.plot("brb_general_2020", ax=axes[0])
axes[0].set_title('Kingston')
kingston_pop.plot("population_2015", ax=axes[1])
axes[1].set_title('Kingston')
port_of_spain_pop.plot("population_2015", ax=axes[2])
axes[2].set_title('Port of Spain')
plt.tight_layout()
plt.show()
As you can see, the high resolution of the points does not allow a clear visualization of the data.
We will solve this by adding the information in an uniform spatial units: hexagons*.
HEXS_RES = 9
hexs_saint_michael = up.geom.gen_hexagons(resolution=HEXS_RES, city=saint_michael)
hexs_port_of_spain = up.geom.gen_hexagons(resolution=HEXS_RES, city=port_of_spain)
hexs_kingston = up.geom.gen_hexagons(resolution=HEXS_RES, city=kingston)
Population data points aggregated in Hexagons
hexs_saint_michael_pop = up.geom.merge_shape_hex(hexs_saint_michael, saint_michael_pop, agg={"brb_general_2020": "sum"})
hexs_port_of_spain_pop = up.geom.merge_shape_hex(hexs_port_of_spain, port_of_spain_pop, agg={"population_2015": "sum"})
hexs_kingston_pop = up.geom.merge_shape_hex(hexs_kingston, kingston_pop, agg={"population_2015": "sum"})
Now population density spatial distribution can be easily visualized in a map.
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
plt.suptitle('Population at H3 resolution 9 (~0.10km^2) hexagons (polygons)')
hexs_saint_michael_pop.plot("brb_general_2020", legend=True, ax=axes[0])
axes[0].set_title('Saint Michael')
axes[0].set_xticks([])
axes[0].set_yticks([])
hexs_kingston_pop.plot("population_2015", legend=True, ax=axes[1])
axes[1].set_title('Kingston')
axes[1].set_xticks([])
axes[1].set_yticks([])
hexs_port_of_spain_pop.plot("population_2015", legend=True, ax=axes[2])
axes[2].set_title('Port of Spain')
axes[2].set_xticks([])
axes[2].set_yticks([])
plt.tight_layout()
plt.show()
The number of hexagons in each city depends on its size. Let's check:
hexs_saint_michael_pop.shape, hexs_port_of_spain_pop.shape, hexs_kingston_pop.shape
((410, 3), (146, 3), (5373, 3))
However, as they are spatial units of the same size and shape, we can compare cities without problems.
With UrbanPy we can get Points of Interest in any city from OpenStreetMap using the overpass_pois function from the ´download´ module.
In this example we will query all health facilities to calculate the city accessibility to health facilities
saint_michael_hf = up.download.overpass_pois(saint_michael.total_bounds, 'health')
port_of_spain_hf = up.download.overpass_pois(port_of_spain.total_bounds, 'health')
kingston_hf = up.download.overpass_pois(kingston.total_bounds, 'health')
The number of healths facilities registered in OpenStreetMaps may vary from city to city. Let's check the number of facilities we obtain in each place.
saint_michael_hf.shape, port_of_spain_hf.shape, kingston_hf.shape
((21, 7), (17, 7), (46, 7))
Visualize the facilities in a map
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
plt.suptitle('Health Facilities extracted from OpenStreetMap (points)')
saint_michael_hf.plot(ax=axes[0])
saint_michael.plot(facecolor='none', ax=axes[0])
axes[0].set_title('Saint Michael')
kingston_hf.plot(ax=axes[1])
kingston.plot(facecolor='none', ax=axes[1])
axes[1].set_title('Kingston')
port_of_spain_hf.plot(ax=axes[2])
port_of_spain.plot(facecolor='none', ax=axes[2])
axes[2].set_title('Port of Spain')
plt.tight_layout()
plt.show()
To calculate accesibility we need to compute travel times from each part of the city (hexagons) to the extracted facilities. UrbanPy allows us to get the full power of a routing engine with just a few lines of code.
Let's review the accessibility analisis function bellow:
def accessibility_analysis(hexagons, pois):
# Calculate the Nearest Facility for each Hexagon
hexagons['lon'] = hexagons.geometry.centroid.x
hexagons['lat'] = hexagons.geometry.centroid.y
dists, ixs = up.utils.nn_search(
tree_features=pois[['lat', 'lon']].values,
query_features=hexagons[['lat', 'lon']].values,
metric='haversine'
)
hexagons["nearest_poi_ix"] = ixs
# Calculate travel times and distances
distance_duration = hexagons.progress_apply(
lambda row: up.routing.osrm_route(
origin=row.geometry.centroid,
destination = pois.iloc[row['nearest_poi_ix']]['geometry']
),
result_type='expand',
axis=1,
)
# Add columns to dataframe
hexagons['distance_to_nearest_poi'] = distance_duration[0] / 1000 # meters to km
hexagons['duration_to_nearest_poi'] = distance_duration[1] / 60 # seconds to minutes
custom_bins, custom_labels = up.utils.create_duration_labels(hexagons['duration_to_nearest_poi'])
hexagons['duration_to_nearest_poi_label'] = pd.cut(hexagons['duration_to_nearest_poi'], bins=custom_bins, labels=custom_labels)
return hexagons
UrbanPy have a simple function to start a routing server in the routing module
up.routing.start_osrm_server('central-america', '', 'foot')
Now we can apply this function to each city
hexs_saint_michael_pop_access = accessibility_analysis(
hexagons=hexs_saint_michael_pop,
pois=saint_michael_hf
)
hexs_kingston_pop_access = accessibility_analysis(
hexagons=hexs_kingston_pop,
pois=kingston_hf
)
hexs_port_of_spain_pop_access = accessibility_analysis(
hexagons=hexs_port_of_spain_pop,
pois=port_of_spain_hf
)
Finally, we will stop the routing server to avoid using too much compute resources while we are not requiring the server
up.routing.stop_osrm_server('central-america', '', 'foot')
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
plt.suptitle('Accessibility to Health Facilities\nat H3 resolution 9 (~0.10km^2) hexagons (polygons)')
hexs_saint_michael_pop_access.plot("duration_to_nearest_poi", cmap='magma_r', legend=True, ax=axes[0])
axes[0].set_title('Saint Michael')
axes[0].set_xticks([])
axes[0].set_yticks([])
hexs_kingston_pop_access.plot("duration_to_nearest_poi", cmap='magma_r', legend=True, ax=axes[1])
axes[1].set_title('Kingston')
axes[1].set_xticks([])
axes[1].set_yticks([])
hexs_port_of_spain_pop_access.plot("duration_to_nearest_poi", cmap='magma_r', legend=True, ax=axes[2])
axes[2].set_title('Port of Spain')
axes[2].set_xticks([])
axes[2].set_yticks([])
plt.tight_layout()
plt.show()
With the help the library contextily we can add a basemap bellow our data. Let's plot the duration in intervals of time:
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
plt.suptitle('Accessibility to Health Facilities\nat H3 resolution 9 (~0.10km^2) hexagons (polygons)')
hexs_saint_michael_pop_access.plot("duration_to_nearest_poi_label", legend=True, cmap='magma_r', alpha=0.5, ax=axes[0])
saint_michael_hf.plot(color='r', marker='P', markersize=40, alpha=0.5, ax=axes[0])
axes[0].set_title('Saint Michael')
axes[0].set_axis_off()
cx.add_basemap(axes[0], source=cx.providers.CartoDB.Positron, crs='EPSG:4326')
hexs_kingston_pop_access.plot("duration_to_nearest_poi_label", cmap='magma_r', alpha=0.5, ax=axes[1])
kingston_hf.plot(color='r', marker='P', markersize=40, alpha=0.5, ax=axes[1])
axes[1].set_title('Kingston')
axes[1].set_axis_off()
cx.add_basemap(axes[1], source=cx.providers.CartoDB.Positron, crs='EPSG:4326')
hexs_port_of_spain_pop_access.plot("duration_to_nearest_poi_label", cmap='magma_r', alpha=0.5, ax=axes[2])
port_of_spain_hf.plot(color='r', marker='P', markersize=40, alpha=0.5, ax=axes[2])
axes[2].set_title('Port of Spain')
axes[2].set_axis_off()
cx.add_basemap(axes[2], source=cx.providers.CartoDB.Positron, crs='EPSG:4326')
legend = axes[0].get_legend()
texts = [e.get_text() for e in legend.get_texts()]
lines = legend.get_lines()
hf_line = Line2D([0], [0], marker='P', color='w', label='Health Places', markerfacecolor='r', markersize=10, alpha=0.5)
lines.append(hf_line)
texts.append('Health Places')
fig.legend(lines, texts, loc='upper right', bbox_to_anchor=(1,0.5))
legend.remove()
# plt.tight_layout()
plt.show()
This statics maps were great! But we can also make some interactive maps and save them as HTML files to share this rich data with other people or as a web.
plotly.offline.init_notebook_mode()
labels = hexs_saint_michael_pop_access['duration_to_nearest_poi_label'].unique().sort_values()
fig = up.plotting.choropleth_map(
title='Access to Health Places in Saint Michael 🇧🇧',
gdf=hexs_saint_michael_pop_access,
color_column='duration_to_nearest_poi_label',
color_discrete_sequence=px.colors.sequential.Magma_r,
category_orders={'duration_to_nearest_poi_label': labels},
labels={'duration_to_nearest_poi_label': ''},
width=400, height=400, zoom=11.5, opacity=0.5,
)
fig.update_layout(
margin=dict(l=0, r=0, b=0),
)
fig.update_traces(marker=dict(line=dict(width=0)))
fig.write_html("outputs/interactive_maps/saint_michael.html")
fig.show()
fig = up.plotting.choropleth_map(
title='Access to Health Places in Kingston 🇯🇲',
gdf=hexs_kingston_pop_access,
color_column='duration_to_nearest_poi_label',
color_discrete_sequence=px.colors.sequential.Magma_r,
category_orders={'duration_to_nearest_poi_label': labels},
labels={'duration_to_nearest_poi_label': ''},
width=400, height=400, zoom=9, opacity=0.5,
)
fig.update_layout(
margin=dict(l=0, r=0, b=0),
)
fig.update_traces(marker=dict(line=dict(width=0)))
fig.write_html("outputs/interactive_maps/kingston.html")
fig.show()